Corpus: bel_newscrawl_2011_100K

4.3.1.5 Number of Word-N-grams at Sentence Beginnings

Number of word N-grams (N = 1, ..., 5) at the beginnings of the first K sentences
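The statistic described above can be sketched as follows. This is only an illustrative sketch, not the tool that produced this page: the function name `count_sentence_initial_ngrams` is hypothetical, tokens are assumed to be separated by whitespace, the counts are assumed to be counts of distinct N-grams, and sentences shorter than N tokens are assumed to contribute no N-gram.

```python
from typing import Dict, Iterable, List


def count_sentence_initial_ngrams(
    sentences: Iterable[str], ks: Iterable[int], max_n: int = 5
) -> Dict[int, List[int]]:
    """For each K in ks, count the distinct word N-grams (N = 1..max_n)
    that start one of the first K sentences.

    Assumptions (not taken from the source page): whitespace tokenization,
    distinct counts, and sentences with fewer than N tokens are skipped for N.
    """
    checkpoints = sorted(ks)
    seen = {n: set() for n in range(1, max_n + 1)}
    results: Dict[int, List[int]] = {}
    next_checkpoint = 0

    def snapshot() -> List[int]:
        # Current number of distinct N-grams for N = 1..max_n.
        return [len(seen[n]) for n in range(1, max_n + 1)]

    for count, sentence in enumerate(sentences, start=1):
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                seen[n].add(tuple(tokens[:n]))
        # Record a row whenever a requested K has been reached.
        while next_checkpoint < len(checkpoints) and checkpoints[next_checkpoint] <= count:
            results[checkpoints[next_checkpoint]] = snapshot()
            next_checkpoint += 1

    # Any K larger than the corpus reports the totals over all sentences.
    for k in checkpoints[next_checkpoint:]:
        results[k] = snapshot()
    return results
```

Under these assumptions, a value of K that exceeds the number of sentences in the corpus simply reports the totals over the whole corpus.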


Zipf's diagram for sentence beginnings
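A Zipf diagram plots frequency against rank, usually on logarithmic axes. The sketch below shows one way the underlying rank-frequency pairs for sentence beginnings could be computed. It is an assumption about the kind of data such a diagram is based on, not the procedure used for this page: the function name `rank_frequency` is hypothetical, and tokens are again assumed to be separated by whitespace.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_frequency(sentences: Iterable[str], n: int = 1) -> List[Tuple[int, int]]:
    """Return (rank, frequency) pairs for the N-grams that start sentences.

    Rank 1 is the most frequent sentence-initial N-gram. Sentences with
    fewer than n whitespace-separated tokens are ignored (an assumption).
    """
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if len(tokens) >= n:
            counts[tuple(tokens[:n])] += 1
    frequencies = sorted(counts.values(), reverse=True)
    return list(enumerate(frequencies, start=1))
```

Plotting the resulting pairs with both axes on a logarithmic scale gives the usual Zipf view; a roughly straight line would indicate power-law behaviour.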


K        # of words  # of bigrams  # of trigrams  # of 4-grams  # of 5-grams
100              75            98             99            99            99
1000            472           943            989           992           994
10000          2746          7896           9601          9884          9933
100000        14275         59205          88679         97015         98840
1000000       14275         59206          88680         97016         98841
4953 msec needed at 2018-02-02 18:23